Modeling Web Applications with UML
Introduction
Web applications are becoming more and more popular. This is in part due to
the rapid deployment of the tools and technologies for developing them, but
mostly because system designers are recognizing the situations where web
applications have very significant advantages over traditional applications.
To date the focus of web application development has been the tools. Little
attention has been paid to the development process. Current development
environments make it so easy to produce simple web applications that they have
the unfortunate side effect of encouraging us to develop and evolve applications
in the absence of serious analysis and design. Any system with non-trivial
complexity needs to be designed and modeled. A primary goal of this paper is to
stress the need for proper modeling. Unfortunately, how to model web applications
is not obvious.
Most of my experience with web applications comes from my use of Microsoft’s
Active Server Page technology. I have made a very strong attempt to make the
discussions in this paper as generic as possible, so that other technologies
such as Allaire’s Cold Fusion could also be used. In addition web applications
based on CGI scripts or ISAPI filters can still benefit from the modeling
techniques described here.
This paper begins with a very brief explanation of web architecture. To keep
it short, discussions of images, image maps and other primarily user interface
features of web sites and applications have been omitted. The focus is on the
ability of a web application to function as a software application, and the
details of display and formatting are only of interest in those special cases
where they directly affect the business logic. The use of external applications
and MIME types is also not addressed.
Discussions in this paper assume working knowledge of UML, and some knowledge
of web applications. The audience for this paper is developers and designers of
web applications, and it is assumed that they are familiar with their own web
application development environments.
Web Application Architecture
Web Sites
There is a subtle distinction between a web application and a web site. For
the purpose of this paper a web application is a web site where user input
(navigation through the site and data entry) affects the state of the business
(beyond, of course, access logs and hit counters). In essence a web application
uses a web site as the front end to a more typical application.
The architecture for a web site is rather straightforward. It contains three
principal components: a web server, a network connection and one or more client
browsers. The web server distributes pages of formatted information to clients
that request it. The request is made over a network connection and uses the HTTP
protocol. Figure 1 shows this relationship. Some web sites require clients to
log on, and some allow anonymous access.
Figure 1 Basic Web Application Architecture
The information made available by a web site is typically stored, already
formatted, in files. Clients request files by name, and when necessary provide
specific path information with the request. These files are termed pages, and
represent the content of a web site.
In some situations the content of a page is not necessarily stored inside the
file. It can be assembled at runtime from information stored in a database (or
other information repository) and formatting instructions in a file.
Alternatively it can come from the output of a loadable module (CGI or ISAPI).
The web server uses a page filter to interpret and execute the scripts in the
page. Web sites employing this strategy are called dynamic sites.
Figure 2 Dynamic Web Site Architecture
Dynamic web sites offer certain advantages to web site designers. They make
it easy to keep the content fresh and synchronized with data in a database. The
overall look and feel of the web site is defined by a set of pages that contain
code executed by the web server during a request for this page. In this context
the file can either be a plain text file with scripts interpreted by the web
server, or a compiled binary file that is executed by the web server. In either
case the code in the "page" references and utilizes server resources which
include databases, email services, file services, etc.
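The idea of assembling a page at request time from stored data plus formatting instructions can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `courses` table, the `PAGE_TEMPLATE` text and the `build_page` function are invented for the example, and an in-memory SQLite database stands in for whatever repository a real site would use.

```python
import sqlite3
from html import escape
from string import Template

# Hypothetical formatting instructions that would normally live in a page file.
PAGE_TEMPLATE = Template(
    "<html><body><h1>$title</h1><ul>$items</ul></body></html>"
)

def build_page(connection: sqlite3.Connection) -> str:
    """Assemble the HTML for one request from rows held in the database."""
    rows = connection.execute("SELECT name FROM courses ORDER BY name").fetchall()
    items = "".join(f"<li>{escape(name)}</li>" for (name,) in rows)
    return PAGE_TEMPLATE.substitute(title="Course Catalog", items=items)

# An in-memory database stands in for the site's real data store (assumption).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE courses (name TEXT)")
db.executemany("INSERT INTO courses VALUES (?)", [("UML Basics",), ("Web Design",)])
page = build_page(db)
```

Because the list items are generated on every call, adding a row to the table changes the next page served without touching the template, which is the "fresh and synchronized" property described above.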
A user interacts with a web site via a browser. A browser is an application
that runs on a client machine, connects to a server on a network and
requests a page of information. Once the page request has been fulfilled the
connection terminates. The browser knows how to communicate (via HTTP) with a web
server, and how to render formatted information returned by the web server. Most
pages of information contain links to other pages (possibly on other
servers), which the browser user may easily request. Users navigate the web by
clicking on links and requesting pages from web servers.
Web Applications
The distinction between web sites and web applications is subtle, and relies
on the ability of a user to affect the state of the business logic on the
server. Certainly if no business logic exists on a server, the system should not
be termed a web application. For those systems where the web server (or an
application server that uses a web server for user input) allows business logic
to be affected via web browsers, the system is considered a web application. For
all but the simplest web applications the user needs to impart more than just
navigational request information; typically web application users enter a varied
range of input data. This data might be simple text, check box selections, or
even binary and file information.
The distinction becomes even more subtle in the case of search engines, where
users do enter relatively sophisticated search criteria. Search engines that
are web sites simply accept this information, use it in some form of database
SELECT statement and return the results. When the user finishes using the system
there is no noticeable change in the state of the search engine (except of
course in the usage logs and hit counters). This is contrasted with web
applications that, for example, accept on-line registration information. A web
site that accepts course registration information from a user has a different
state when the user finishes using the application.
The overall architecture of a web application is identical to that of a web
site. It can, however, become significantly more elaborate. The rest of this
section gradually builds on the web site architecture to eventually reach a
fairly complete and complex web application architecture. Given the history of
this industry, this classification of fairly complete may not even last until
this paper is completed. Regardless, it will contain most of the concepts
and components that are expected to be the cornerstones of web applications for
the next several years.
Pages
By far the most fundamental component of a web application is the page.
Browsers request pages (or conceptual pages) from servers. Web servers
distribute pages of information to browsers. The makeup and organization of the
web pages in essence constitute the user interface for the application. In web
applications the browser acts as a generalized user interface container with
specific user interfaces being defined by each page’s content.
In web application development environments like Microsoft’s Active Server
Pages or Allaire’s Cold Fusion, the pages are a combination of static HTML
formatted pages, and dynamic scripted pages. The scripted pages contain code
that is executed by the web server (actually it is more likely to be delegated
to a scripting engine or page filter) that accesses server resources to
ultimately build an HTML formatted page. The newly formatted page is sent back
to the browser that requested it.
Server Scripting
It is important to note that the connection between the client and server
only exists during a page request. Once the request is fulfilled the connection
is broken. All activity on the server (as affected by the user) occurs during
the page request. This represents a very significant distinction from
traditional client server applications. Business logic on the server is only
activated by the execution of scripts inside the pages requested by the browser.
Depending upon the specific scripting engine, scripted pages can contain user
defined variables, subroutines and functions. Some scripting engines even
permit the definition and interaction of objects.
The ultimate result of this server processing is to:
- update the business state of the server, and
- prepare an HTML formatted page (user interface) for the requesting
browser.
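The two results listed above can be illustrated with a small Python sketch of what a scripted page does during one request. Everything named here (`registrations`, `process_request`, the `course` and `student` parameters) is hypothetical and chosen only for the example; it is not the API of any particular scripting engine.

```python
from html import escape

# Hypothetical business state: course name -> list of registered students.
registrations = {"UML Basics": []}

def process_request(params: dict) -> str:
    """Run the 'server script' for one page request and return HTML."""
    course = params.get("course", "")
    student = params.get("student", "")
    if course in registrations and student:
        # Result 1: update the business state of the server.
        registrations[course].append(student)
        message = f"{student} is registered for {course}."
    else:
        message = "Registration failed: unknown course or missing name."
    # Result 2: prepare an HTML formatted page for the requesting browser.
    return f"<html><body><p>{escape(message)}</p></body></html>"

response = process_request({"course": "UML Basics", "student": "Ada"})
```

Nothing runs between calls to `process_request`; all server activity on behalf of the user happens inside it, which mirrors the request-bounded behavior described in this section.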
An important and subtle part of web application design is understanding and
accommodating this paradigm of client and server interaction. Business
objects are not always accessible when handling individual user interface
requests. For example a common user interface (and business feature) in many
client server applications is the automatic population of city and state fields
in a US postal address when a zip code is entered. Assuming that all three
fields are located on the same page in a browser this feature would require an
additional server page request to happen immediately after the zip code was
entered. For most web applications this carries with it an unacceptable
performance burden, because page requests are fulfilled on the
order of seconds instead of milliseconds.
Client Scripting
The server is not the only component in a web application that executes
scripts. The browser itself can execute scripted code in a page. When the
browser executes a script, however, it does not have direct access to server
resources. Typically scripts running on the client augment the user interface as
opposed to defining and implementing core business logic.
Scripts on the client are appropriate for immediate data validation, or in
assisting navigation. Often client scripts simply "jazz up" the user interface
and provide little if any business logic behavior. This is changing, however, as
client side scripting becomes more powerful, and client side resources become
more a part of the overall application.
Client scripts should not be confused with client side components such as
Java Applets or ActiveX controls. These components are a separate category of
component in the overall web application architecture and are discussed in more
detail later. The client scripts discussed here are JavaScript (or VBScript)
code embedded in the HTML formatted page. The code is executed in response to
browser generated events (document loaded, button pressed, etc.). With the
acceptance of the new Dynamic HTML specification, client scripts can access and
control nearly every aspect of the page’s content. Dynamic HTML further opens
up access to the browser object model itself, enabling client side scripts to
interact with other browser resources.
When an HTML web page is rendered in a browser it is first parsed and divided
up into elements. When dynamic HTML is employed each of the elements can be
named or assigned an ID, which could be referenced by client side scripts. Some
common element types include anchors (links to other pages), tables, font
specifications, etc. The elements making up the content of the page have an
object interface defined by the Document Object Model. The browser also has an
accessible interface, although different brands of browsers may have subtle
differences.
Client scripts, like their server side counterparts, may contain variable
declarations, subroutines and functions. The only major conceptual difference
is that server side scripts contained in a page are inherently procedural, while
client side scripts are inherently event driven.
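The contrast between procedural and event driven code can be shown side by side. The sketch below uses Python for both halves purely for illustration (real client scripts would be JavaScript or VBScript running in the browser); the `ClientPage` class and its `on` and `fire` methods are invented stand-ins for the browser's event machinery.

```python
from typing import Callable

def server_script(params: dict) -> str:
    """Procedural style: runs top to bottom once per page request."""
    name = params.get("name", "guest")
    greeting = f"Hello, {name}"
    return f"<p>{greeting}</p>"

class ClientPage:
    """Event driven style: handlers sit idle until an event is fired."""

    def __init__(self) -> None:
        self.handlers: dict = {}
        self.log: list = []

    def on(self, event: str, handler: Callable[[], None]) -> None:
        # Register a handler for a named event.
        self.handlers.setdefault(event, []).append(handler)

    def fire(self, event: str) -> None:
        # Stand-in for the browser raising an event such as "load" or "click".
        for handler in self.handlers.get(event, []):
            handler()

page = ClientPage()
page.on("load", lambda: page.log.append("document loaded"))
page.on("click", lambda: page.log.append("button pressed"))
page.fire("click")
page.fire("load")
```

The order of entries in `page.log` is decided by the order in which events arrive, not by the order in which the handlers were written, which is the essential difference from `server_script`.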
Forms
Any serious web application accepts more than navigation input from its
users. Web applications often elicit textual, selectable and Boolean
information. The most common mechanism for collecting this type of user input is
with HTML forms.
An HTML form is a collection of input fields that are rendered in a web page.
The basic input elements are: a textbox, text area, checkbox, radio button
group, and selection list.
All the input elements on a form are identified by name or ID. Each form is
associated with an action page. This action page represents the name (and
location) of the page that is to receive and process the information contained
in the completed form. The action page is almost always a dynamic page,
containing server side scripts (or compiled code).
When a user completes a form, the user submits it back to
the server with a page request for the action page. The web server finds the
page and interprets (or executes) the page’s code. The code in the page has the
ability to access any information in the form that was submitted with the
request. This is the major mechanism for obtaining user input in a web
application.
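As a rough sketch of what the action page's code sees, the following Python fragment encodes a form the way a browser commonly does (URL-encoded name and value pairs) and then decodes it on the receiving side. The field names and the `action_page` function are assumptions made for the example; `urlencode` and `parse_qs` are standard library functions.

```python
from urllib.parse import parse_qs, urlencode

# What a browser might send for a form with a textbox, a checkbox and a
# selection list (all field names here are hypothetical).
submitted_body = urlencode(
    {"student": "Ada Lovelace", "newsletter": "on", "course": "UML101"}
)

def action_page(body: str) -> dict:
    """Decode the submitted form so the page's code can read each field."""
    fields = parse_qs(body)
    # parse_qs returns a list per field name; keep the first value of each.
    return {name: values[0] for name, values in fields.items()}

form = action_page(submitted_body)
```

Each input element is addressed by the name it was given in the form, which is why the names (or IDs) of input elements matter to the design of the action page.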
Components
Server
All business logic needn’t be interpreted from scripts in web pages. Larger
and more enterprise savvy web applications make use of a third middle tier of
components. This middle tier exists between the user interface and the
persistence system, and is typically a set of compiled components that run on an
application server. The application server may execute on the same machine as
the web server but does not necessarily have to. One of the advantages of an
explicit third tier running on a server is the ability to share implementations
of business functions across applications, web or not. Another is an
encapsulation layer, for all business logic. A full description of the merits of
three tiered architecture is beyond the scope of this paper.
When a compiled middle tier of business objects is present in the
architecture, the server scripts in the web pages primarily act as the glue
between the user interface and the business layer. Scripts executed on behalf of
the web server have references to business objects, creating and invoking
methods on them. It should be noted that as before, server side business logic
is only executed on behalf of the user during page requests. The processing of a
page request may cause a process or two to start on the server and
remain executing long after the client browser has shut down, but the process
would be running in isolation from the client. Even when compiled business
object components are used, the connection between the client and the server is
closed once the requested page has been received by the browser.
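The "glue" role of a server script can be sketched as follows. The `RegistrationService` class plays the part of a middle tier business object and `register_page` plays the part of the scripted page; both names, and the seat-counting rule, are hypothetical and exist only to show the division of responsibility.

```python
from html import escape

class RegistrationService:
    """Hypothetical middle tier business object; it knows nothing about HTML."""

    def __init__(self) -> None:
        self._seats = {"UML101": 1}

    def register(self, course: str) -> bool:
        # Business rule: a registration succeeds only while seats remain.
        if self._seats.get(course, 0) > 0:
            self._seats[course] -= 1
            return True
        return False

def register_page(service: RegistrationService, params: dict) -> str:
    """Server script acting as glue between user input and the business layer."""
    course = params.get("course", "")
    outcome = "confirmed" if service.register(course) else "rejected"
    return f"<p>Registration for {escape(course)} {outcome}.</p>"

service = RegistrationService()
first = register_page(service, {"course": "UML101"})
second = register_page(service, {"course": "UML101"})
```

The business rule lives entirely in `RegistrationService`, so a non-web application could reuse it unchanged, while the page function only translates between request parameters and HTML.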
The decision whether to use a third tier depends upon the
specific application. One point to consider is the improved performance
of compiled business objects. Interpreted scripts execute slower than compiled
code, and when performance is important, should be kept to a minimum. An
advantage of server scripted business logic, however, is an easier ability to
modify or enhance code, without taking down the application. Changes can be made
to server scripted pages (within reason) without stopping the application. If
the amount of business logic is relatively small, the absence of a full middle
tier can be very appropriate.
Client
HTML formatted web pages may also specify components for execution on the
client machine. The most common of these components are Java Applets, and
ActiveX controls. Each is a self-contained compiled component that runs at the
request of the browser. Depending upon browser and component configuration they
have access to browser and/or client machine resources.
Components on the client raise significant security concerns which are better
addressed outside this paper. It should be sufficient to say that compiled
components executing in the browser and on the client can make up a significant
part of a web application’s architecture.
Client components are very useful in providing user interface functionality
not readily achievable with standard form or HTML elements. A client component
might be a display control that visualizes a three dimensional model. It might
also represent a user interface control to specify dates with, one that pops up
a miniature calendar from which dates can be clicked on. Some client components
have no visual display, and might be used to retrieve client machine
configuration information (a very touchy subject with the Internet
community).
Frames
The user interface capabilities on the client can be enhanced with the use of
frames. Frames and the ability to target browsers enable the user
interface designer to have multiple web pages active and open at the same time.
The browser divides up its rectangular client window (where web pages are
rendered) into distinct frames (or sub rectangles). Any web page can specify a
frameset, and it is possible for a frameset to be embedded inside
other framesets.
Scripts and components in any of these pages can interact with scripts and
content in others. In HTML this is all managed by associating a target
with each frame in a browser window. A proper discussion of the use of frames in
user interface design is beyond the scope of these discussions, however the
point must be made that the use of frames does constitute a major design
decision in a web application, as it indicates that multiple pages are
simultaneously available to the user.
Figure 3 shows a summary of the major architectural components discussed
here. Two additional components, Application Dictionary and Session Dictionary,
were influenced by Microsoft’s implementation of Active Server Pages and offer a
convenient mechanism to manage client side state on the server.
Figure 3 Model of a generalized web application architecture
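The purpose of the two dictionaries can be illustrated with a minimal Python sketch. It does not reproduce the Active Server Pages objects; the `application` and `sessions` dictionaries, the `handle` function and the use of a random session id are all assumptions made for the example.

```python
import uuid
from html import escape
from typing import Optional

# One dictionary shared by every user of the application.
application = {"visits": 0}
# One dictionary per client, keyed by a session id the browser sends back.
sessions = {}

def handle(session_id: Optional[str], params: dict) -> tuple:
    """Return (session_id, html) for one page request."""
    if session_id not in sessions:
        # Unknown or missing id: start a new session for this client.
        session_id = uuid.uuid4().hex
        sessions[session_id] = {}
    session = sessions[session_id]
    if "name" in params:
        session["name"] = params["name"]
    application["visits"] += 1
    name = escape(session.get("name", "stranger"))
    return session_id, f"<p>Hello, {name}. Visit {application['visits']}.</p>"

sid, first = handle(None, {"name": "Ada"})
_, second = handle(sid, {})
```

The second request carries no name, yet the server still greets the same user, because the state was kept on the server between two otherwise unconnected page requests.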
Other Components
A very effective and complex web application can be developed with the
components mentioned previously. But nothing is good enough when technology is
concerned. Some of the latest developments in the web application area still
impact the architecture. Included in this are scriptlets and XML. Scriptlets are
presently a Microsoft browser exclusive and therefore only an option when it can
be guaranteed that all clients are using Internet Explorer 4.0 or later. The
other, XML, is also pushed by Microsoft, but is receiving an unusual amount of
independent support.
A scriptlet is a cached HTML page (with possible object references) on the
client that is used by many pages in a web application. The obvious advantages
are re-usability and lowered network traffic.
XML (eXtensible Markup Language) is a subset of SGML (Standard Generalized
Markup Language) and defines how related data can be transmitted across the web
in a standardized way. Much like how HTML uses tags to describe the formatting
in a web page, XML enables user defined tags to describe the meta-structure of
data with the actual instance data as it is transmitted between server and
client. Again, this is not the proper place to discuss XML in detail; the most
recent draft of the XML standard can be found at
http://www.w3.org/pub/WWW/TR/WD-xml.html. For the purposes of this paper XML
defines a portable way to encode hierarchical data.
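A short Python sketch shows what encoding hierarchical data with user defined tags looks like in practice. The tag names (`course`, `title`, `students`, `student`) are invented for the example; the calls are from the standard library's `xml.etree.ElementTree` module.

```python
import xml.etree.ElementTree as ET

# Build a small hierarchy using made-up tag names.
course = ET.Element("course", {"id": "UML101"})
ET.SubElement(course, "title").text = "UML Basics"
students = ET.SubElement(course, "students")
ET.SubElement(students, "student").text = "Ada"
ET.SubElement(students, "student").text = "Grace"

# Serialize for transmission, then parse it back as a receiver would.
wire_text = ET.tostring(course, encoding="unicode")
received = ET.fromstring(wire_text)
names = [student.text for student in received.find("students")]
```

The receiver recovers both the structure (students nested inside a course) and the instance data from the text alone, without sharing any code with the sender.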
Modeling
Modeling is important. It helps us manage complexity. Web applications can
get complex rather quickly. A given system can be represented by many different,
yet consistent models. Each model has a specific purpose and audience. This
paper focuses on design models for web applications, and the audience is
primarily the web architect and designer. It is important when modeling to
capture the appropriate level of abstraction and to model the artifacts.
With these goals in mind the task of modeling web application design becomes
less clear. Given the assumption that the primary artifact of a web
application is the web page, it should be obvious that the page should be
modeled. But how? Using UML we can express a page as an object. This then brings
up the question: what are the properties of such an object? Is it appropriate to
express the layout elements (fonts, tables, text, etc.)? Should the scripts in a
page be identified as methods of a page object?
The answer should come from the question: what is the model being used for,
and who is the audience? In the case of a design model, the user interface
formatting is irrelevant, and typically doesn’t affect the business logic in the
system. Scripts, especially server side scripts, do affect the business behavior
of the system (and in some systems represent the entirety of the system’s
business logic!). Additionally it is not hard to visualize variables in a
scripted page (those with page scope) as being attributes of a page object and
the functions in the page as being its methods. These are appropriate for a
design model and for a web application designer.
This, however, leads to another problem. Web pages can contain scripts for
both the server machine as well as the client. Intermixing attributes and
methods for server and client execution can be very confusing. Solving this
problem involves using a relatively new feature in the modeling toolbox:
extensions.
Modeling Extension
The designers of UML recognized that the language is not always perfect for
every situation. There are times when the development process would be better
served if additional information were captured, or different semantics were
applied to certain modeling elements. UML has defined a mechanism to allow
certain domains to extend the semantics of specific model elements. The
extension mechanism allows the inclusion of new attributes, different semantics
and additional constraints. When collected together as Tagged Values,
Stereotypes and Constraints they form an Extension to UML. This paper presents
an extension of UML for web application designs.
Part of the extension mechanism of UML is the ability to assign different
icons to stereotyped classes. A list of prototype icons for the most common
class stereotypes can be found as an appendix to this paper.
The problem of a web page having different scripts and variables executed on
the server or on the client can be solved in one of two ways. The first would be
to define the stereotypes server method and client method. In a page object a
method that executes on the server would be stereotyped as «server method» and
functions that run on the client as «client method». This solves the problem of
distinguishing attributes and methods of a page object; however, it is still
confusing. A further complication arises later when associations are made to
other components in the model. It is not clear that some of these relationships
are valid only in the context of the server methods and attributes or on the
client.
Page Stereotypes
A better way to model a page is with two separately stereotyped classes:
server page and client page. Any given web page in a web application that has
functionality on the server as well as client can be represented in the model as
two separate classes, even though their implementation is in the same file (or
component). In this situation a web page’s server methods and page scoped
variables are all contained in a class in the model stereotyped «server page».
This class’s methods represent the page’s server side scripts: subroutines and
functions. Variables declared in the scripts that have page scope represent the
class’s attributes. Client side scripts or user interface formatting are not
part of a server page’s scope. A server page can have relationships to
components that exist on the server. These might include the business objects in
three tiered systems or data access components. Components relevant to the
application’s design and operation on the server are represented side by side
with the server pages that use them.
Client pages are similarly represented on the diagram with classes
stereotyped «client page». A client page’s attributes and methods are the page
scoped variables and functions that execute in the client browser. Client pages
are associated
with components that execute on the client, including Java Applets, ActiveX
controls, and elements of the Document Object Model itself.
There is a fundamental relationship between the server and client stereotypes
of a web page. A server page ultimately builds the resulting client page.
This is a unidirectional relationship, since a completed HTML page has little
access to the object interface of the building server page. The stereotype
«builds» is applied to associations and is always drawn in the model as a
unidirectional association from a server page to a client page (Figure 4). It
indicates which server page is responsible for building a given client page.
Figure 4 Server pages build client pages
It is conceivable that a given server page might build two distinctly
different client pages. So different in fact, that it would be better (clearer)
to represent the result as two separate client pages in the model. The merits of
such design decisions may be debated, however, the extension mechanism doesn’t
prevent such constructs.
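One way to make the two page stereotypes and the «builds» association concrete is to write the model elements down as data. The Python sketch below is only an analogy for the UML model, not part of the extension itself; the class fields and the example page names (`Search`, `SearchResults`, `NoMatches`) are invented for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class ClientPage:
    """Stands for a class stereotyped «client page»."""
    name: str
    attributes: list = field(default_factory=list)  # page scoped variables
    methods: list = field(default_factory=list)     # client side functions

@dataclass
class ServerPage:
    """Stands for a class stereotyped «server page»."""
    name: str
    attributes: list = field(default_factory=list)  # page scoped variables
    methods: list = field(default_factory=list)     # server side functions
    builds: list = field(default_factory=list)      # «builds», one direction only

results = ClientPage("SearchResults", methods=["onRowClick"])
no_match = ClientPage("NoMatches")
search = ServerPage("Search", attributes=["query"], methods=["runQuery"])
# One server page may build two distinctly different client pages.
search.builds += [results, no_match]
built_names = [page.name for page in search.builds]
```

The client page classes hold no reference back to the server page, which reflects the unidirectional nature of the «builds» association.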
Another facility of some web application development technologies is the
ability to redirect the processing requests to another «server page».
This relationship can be expressed in the model with the «redirects» association
stereotype. A design might identify a particular server page as a gatekeeper page
that, given certain input, redirects further processing (and consequently
returns different user interfaces). Redirection is a very useful feature for
re-use especially in non three tiered web applications. Figure 5 shows this
relationship.
Figure 5 Server pages can delegate
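The gatekeeper arrangement can be sketched as one function handing the request to another. This is a simplified illustration: the page names and the `role` parameter are hypothetical, and a plain function call stands in for whatever redirection facility a given development technology provides.

```python
def admin_page(params: dict) -> str:
    """Server page that builds the administration user interface."""
    return "<p>Administration</p>"

def catalog_page(params: dict) -> str:
    """Server page that builds the ordinary catalog user interface."""
    return "<p>Course catalog</p>"

def gatekeeper_page(params: dict) -> str:
    """Inspect the input and hand the request to another server page."""
    if params.get("role") == "admin":
        return admin_page(params)    # «redirects» to the admin server page
    return catalog_page(params)      # «redirects» to the catalog server page

admin_view = gatekeeper_page({"role": "admin"})
guest_view = gatekeeper_page({})
```

The gatekeeper produces no user interface of its own; in the model it would carry two «redirects» associations and no «builds» association.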
A fundamental and yet subtle relationship between client pages and server
pages is in the implementation diagram. Components in the implementation diagram
represent distributable pieces of the system. For these stereotypes it is the
web page. A component in an implementation diagram (component view in Rational
Rose) represents an actual file that can be requested from the web server, and
which realizes at least one server page or client page. Figure 6 conceptually
shows this relationship.
Figure 6 A web page component realizes both server and client pages
An additional relationship that may be of importance in web application
design is the hyper link. Client pages often contain hyper links (anchors) to
other web pages. These other web pages can be either server or client pages,
since ultimately it is the component that is requested by the client browser. If
the requested component realizes a server page (at most one) then the server
page is processed in order to get a resulting client page to fulfill the
browser’s request. If not the web server simply retrieves the requested
component (file) and sends it back to the browser.
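The rule for fulfilling a request can be written as a small decision. In the Python sketch below, `Component`, `fulfill` and the two file names are hypothetical; the point is only the branch between processing a server page and returning a stored file unchanged.

```python
from typing import Callable, Optional

class Component:
    """A requestable file; it realizes at most one server page."""

    def __init__(self, content: str,
                 server_page: Optional[Callable[[dict], str]] = None) -> None:
        self.content = content          # stored, already formatted text
        self.server_page = server_page  # server side code, if any

def fulfill(site: dict, name: str, params: dict) -> str:
    """Return the client page for a request, as a web server might."""
    component = site[name]
    if component.server_page is not None:
        # The component realizes a server page: process it to build the result.
        return component.server_page(params)
    # Otherwise the stored file is sent back unchanged.
    return component.content

site = {
    "about.html": Component("<p>About us</p>"),
    "hello.asp": Component("", lambda params: f"<p>Hello, {params['name']}</p>"),
}
static_result = fulfill(site, "about.html", {})
dynamic_result = fulfill(site, "hello.asp", {"name": "Ada"})
```

From the browser's point of view both requests look the same, which is why a hyper link can point at either kind of page.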
The stereotype «links» is defined for associations between client pages and
other pages (server or client). See Figure 7. The decision to model all the
hyper links in client pages is really left to the designer; however, a good
design should model all hyper links relevant to the functioning of the
application. It may not be necessary to model hyperlinks to web pages outside
the system, or to company home pages and the like. A «links» association may be
a bi-directional association. A «links» relationship from a server page does
not make sense. If the hyper link includes parameters, they are modeled as
link attributes off of the association, as shown in Figure 8.
Figure 7 Client pages can link to each other
Figure 8 Linking with parameters
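In an implementation, the link attributes of a «links» association typically end up as the query string of the hyper link. The sketch below assumes that mapping; the `links` helper, the page name `course.asp` and the parameter names are invented for the example, while `urlencode`, `urlsplit` and `parse_qs` are standard library functions.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

def links(target: str, **parameters: str) -> str:
    """Build the hyper link for a «links» association and its link attributes."""
    query = urlencode(parameters)
    return f"{target}?{query}" if query else target

href = links("course.asp", id="UML101", term="fall")

# The receiving page sees the same parameters after decoding the request.
parts = urlsplit(href)
received = parse_qs(parts.query)
```

Modeling the parameters on the association keeps them visible in the design, so the page being linked to and the pages linking to it can be checked against the same list of names.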
Components
Components in the sense of interfaces available to objects in the web
application like ActiveX controls and DLLs, Java Applets or executables are also
given a stereotype in the web extension. Just as with pages, components are
identified as being executed on the server machine or on the client machine. The
stereotypes «server component» and «client component» can be applied to classes
in the design model to distinguish availability. Certainly a database access
component on the server is not directly accessible by client scripts running in
a browser. Stereotyping components (interfaces) in the design model helps to
clarify the model.
Forms
Additional stereotypes are defined for separating and elaborating HTML Form
usage. Forms in an HTML formatted page really represent a distinct part of the
client page. Forms contain additional attributes that may not be appropriate in
the context of the entire client page. It is also possible to have multiple
forms in a single page, each targeting a different action page. This can
be modeled by creating a new stereotyped class to represent a single HTML form:
«form».
A form class has as attributes its field elements. Methods, however, do not
apply to forms, as a method defines a dynamic behavior within the context of a
single form. Methods in a client page have access to all attributes of forms
contained within a page. The proper relationship between a client page and a
form is containment. Client pages contain forms.
A form identifies a specific web page (almost always one with a server page
stereotype) to accept and process data submitted with the form. A «submits»
association stereotype represents the relationship between a form and the web
page that processes it, see Figure 9. The association is bi-directional since
the processing page has access to the form’s attributes, which are submitted
when the association is realized during runtime.
Figure 9 Forms submit to server pages
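The containment and «submits» relationships can be sketched as data in the same way as the page stereotypes. The Python classes below are an analogy for the model elements only; the field names, the `visible_fields` helper and the example pages are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class ServerPage:
    """Stands for a class stereotyped «server page»."""
    name: str

@dataclass
class Form:
    """Stands for a class stereotyped «form»: field elements, no methods."""
    name: str
    fields: list
    submits_to: ServerPage  # the «submits» association

@dataclass
class ClientPage:
    """Stands for a «client page» that contains zero or more forms."""
    name: str
    forms: list = field(default_factory=list)  # containment

    def visible_fields(self) -> list:
        """Client page methods can reach the fields of every contained form."""
        return [f"{form.name}.{name}" for form in self.forms for name in form.fields]

page = ClientPage("Register", forms=[
    Form("login", ["user", "password"], ServerPage("Login")),
    Form("signup", ["email"], ServerPage("CreateAccount")),
])
action_pages = [form.submits_to.name for form in page.forms]
```

Two forms in one client page each name their own action page here, which is the situation that motivates giving forms a stereotype of their own.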
Framesets
An additional user interface (and design element) available in web
applications is the frame. If used in an application, it represents an
ability to present multiple web pages at the same time. Typically these
concurrent pages are related and together represent a single user
interface.
Frames are implemented in HTML by defining a frameset. A frameset
specifies and optionally names separate frames in which web pages can be
rendered. The implementation of a frameset is in an HTML page. To maintain
compatibility with older non-frames capable browsers, a frameset web page
usually contains formatting and informational content that is only seen on the
older browsers. This leads us to model framesets as a client page, but a
specialized one, and hence a new stereotype: «frameset». In a design model
classes stereotyped frameset may have all the associations that a client page
can have, with the understanding that these are only appropriate for older
browsers.
More typically, framesets contain multiple client pages. Any client page can
be contained by a frameset. Since a frameset is just a specialization of a
client page, it too can be contained in a frameset!
Coordinating activity between pages in frames (or other windows) requires the
ability to reference pages inside of frames. Target is the term used when a
client page references another active web page or frame. Since targets represent
a very different element from a frameset, and considering web pages can also
reference targets that are in other opened browsers, another class stereotype is
defined: «target». A target has no properties or attributes; it is merely a
reference-able container for a client page. A frameset class can contain a
target, or a target can exist independently (as in the case of a separate
browser window).
The main advantage of pulling out target as a stereotype is that it can be
shared and referenced by many client pages. Additionally since it has no
meaningful attributes or methods its semantics are different from normal
classes.
A final stereotype needs to be defined for associations that indicate that
one client page is requesting a link to be loaded into a browser window other
than itself. A «targeted link» stereotype is applied to associations between
client pages and targets that they interact with. Parameters that are passed to
the server with the targeted link can be identified with a UML link attribute.
See Figure 10.
Figure 10 Using framesets and targets
Other Stereotypes
At the present time the web extensions for UML are just being finalized for
their initial release. Under consideration are the following class
stereotypes;
One user interface feature of a web application not addressed yet is the
ability of a client page to pop up a modal dialog box, to gather user input.
Client scripts invoke this operation, and this is modeled with a directional
association from a calling client page to another web page (server or client).
The association is stereotyped «dialog», indicating that the client page calling
the dialog is temporarily suspended until the called "dialog" page closes.
Typically a client page calls a dialog page to get small amounts of user input
that either don’t fit on the original client page or are so infrequently needed
that they aren’t worth the screen real estate there.
The last stereotype under consideration is «event», the only method
stereotype of the extension. It applies to methods in a client page object that
are handlers for browser-generated events. These help the designer visualize
which behaviors in a client page can happen asynchronously.
Knowing the semantics of design models is certainly only part of the process
of delivering quality software. Before concluding our discussions on modeling
web application design I would like to offer a few words of advice regarding the
steps to take when developing web application designs.
A web application is a specialized version of a client/server application.
Many of the development activities appropriate for client/server application
development also apply to web applications. The importance of requirements
modeling cannot be emphasized enough. Use cases provide an excellent way to
capture and model a system’s requirements. Although not all requirements are
present in use cases, they do provide an excellent mechanism for organizing
specific functionality and especially user interaction.
A use case is a textual description of the interaction of the system and the
user, in the language of the domain. How tightly coupled use cases are with user
interfaces is yet another matter of great debate. It should be sufficient to say
that an examination of a system’s use cases provides an excellent starting point
to identify client pages in the application. For the first iteration of design
it is a good idea to start off with one client page for each use case involving
interaction between a human actor and the system.
Do not attempt to model server pages yet. Especially in three-tiered systems,
server pages are only the glue between client pages and server components. The
next step is to model the business objects of the system. With the business
objects modeled and the client pages identified, begin gluing them together with
server pages. Most likely there will be one server page for each client page
that contains business object data.
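The first-iteration rule of thumb described above can be written down as a small sketch: one client page for each use case involving a human actor, and, once the business objects are modeled, one server page for each client page that contains business object data. The Python below is only an illustration. The UseCase fields, the page naming convention and the example use cases are all hypothetical, and a real design will diverge from this starting point as it iterates.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class UseCase:
    name: str
    human_actor: bool          # does a human actor interact with the system?
    shows_business_data: bool  # will its client page contain business object data?


def first_iteration_pages(use_cases: List[UseCase]) -> Dict[str, List[str]]:
    """Derive a starting list of client and server pages from use cases."""
    # Step 1: one client page per use case with a human actor.
    with_pages = [uc for uc in use_cases if uc.human_actor]
    client_pages = [uc.name + "Page" for uc in with_pages]
    # Step 2 (after business objects are modeled): one server page for each
    # client page that contains business object data.
    server_pages = [uc.name + "Server" for uc in with_pages if uc.shows_business_data]
    return {"client": client_pages, "server": server_pages}


# Hypothetical use cases for an online store.
cases = [
    UseCase("BrowseCatalog", human_actor=True, shows_business_data=True),
    UseCase("ViewHelp", human_actor=True, shows_business_data=False),
    UseCase("SyncInventory", human_actor=False, shows_business_data=True),
]
pages = first_iteration_pages(cases)
```

In this example the inventory synchronization use case yields no pages at all because no human actor is involved, and the help page needs no server page because it contains no business object data.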
As with any realistic project, there will be several rounds of iteration and
possibly significant changes in the design. Using the extension presented
earlier will help give the designer and implementer a clearer picture of
the actual web application design.
The major purpose of this paper is to present a mechanism to assist web
application designers. Given the assumptions that modeling is important and that we
should model the artifacts of a system, it becomes obvious that web
application designers must work with pages. Since UML is fundamentally object
oriented and web pages inherently are not, a disconnect arises. The solution, as
presented here, is to apply some new semantics to special model elements, to
help bring out their hidden object oriented aspects.
As a final note, it has to be mentioned that the extensions presented here
are still in development. For the most part they have been driven by my recent
experience developing web applications with the Microsoft set of technologies.
It is hoped that the user community will help refine the extension and additional
processes so that in the near future it can be considered a reliable tool in the
creation of web applications.
Appendix A. Stereotype Icons